AITopics | alleviating pathological sharpness

The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Neural Information Processing Systems, Dec-25-2025, 19:31:18 GMT

Normalization methods play an important role in enhancing the performance of deep learning while their theoretical understandings have been limited. To theoretically elucidate the effectiveness of normalization, we quantify the geometry of the parameter space determined by the Fisher information matrix (FIM), which also corresponds to the local shape of the loss landscape under certain conditions. We analyze deep neural networks with random initialization, which is known to suffer from a pathologically sharp shape of the landscape when the network becomes sufficiently wide. We reveal that batch normalization in the last layer contributes to drastically decreasing such pathological sharpness if the width and sample number satisfy a specific condition. In contrast, it is hard for batch normalization in the middle hidden layers to alleviate pathological sharpness in many settings. We also found that layer normalization cannot alleviate pathological sharpness either. Thus, we can conclude that batch normalization in the last layer significantly contributes to decreasing the sharpness induced by the FIM.
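As a rough illustration of the sharpness the abstract refers to, the sketch below (not the authors' code) looks only at the last-layer weight block of the FIM for a one-hidden-layer ReLU network with a linear readout and squared-error loss. In that block, the gradient of an output with respect to its readout weights is the hidden activation vector, so the block equals the second-moment matrix of the activations; subtracting the batch mean from the output replaces it with their covariance. The function name, width, sample count, input dimension, and initialization scale are all assumptions chosen for illustration, and the toy does not reproduce the paper's conditions on width and sample number.

```python
import numpy as np

def last_layer_fim_top_eig(width, n_samples, input_dim=50,
                           subtract_mean=False, seed=0):
    """Largest eigenvalue of the last-layer FIM block (hypothetical toy setup)."""
    rng = np.random.default_rng(seed)
    # Random inputs and random first-layer weights (assumed Gaussian).
    x = rng.standard_normal((n_samples, input_dim))
    w1 = rng.standard_normal((input_dim, width)) * np.sqrt(2.0 / input_dim)
    h = np.maximum(x @ w1, 0.0)  # hidden activations, shape (N, M)
    if subtract_mean:
        # Mean subtraction over the batch on a linear readout is equivalent
        # to centering the hidden activations that feed it.
        h = h - h.mean(axis=0, keepdims=True)
    # The N x N Gram matrix shares its nonzero eigenvalues with the
    # M x M block (1/N) * h.T @ h, and is cheaper when N < M.
    gram = h @ h.T / n_samples
    return float(np.linalg.eigvalsh(gram)[-1])

plain = last_layer_fim_top_eig(width=500, n_samples=200)
centered = last_layer_fim_top_eig(width=500, n_samples=200, subtract_mean=True)
print(f"top eigenvalue without mean subtraction: {plain:.2f}")
print(f"top eigenvalue with mean subtraction:    {centered:.2f}")
```

In this toy setting the uncentered block is dominated by the direction of the nonzero activation mean, which is why centering shrinks the top eigenvalue.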

alleviating pathological sharpness, name change, normalization method, (5 more...)


Reviews: The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Neural Information Processing Systems, Jan-26-2025, 02:26:10 GMT

This well-written paper is the latest in a series of works that analyze how signals propagate in random neural networks by analyzing the mean and variance of activations and gradients given random inputs and weights. The technical accomplishment can be considered incremental with respect to this series of works. However, while the techniques used are not new, the analysis leads to new insights on the use of batch/layer normalization. In particular, it provides a close look at the mechanisms that lead to pathological sharpness in DNNs, showing that mean subtraction is the main ingredient in countering these mechanisms. While these claims would have to be verified in more complicated settings (e.g. with more complicated distributions on inputs and weights), it is an important first step to know that they hold for such simple networks.
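The kind of signal-propagation analysis the review describes can be imitated numerically. The following is a minimal sketch, assuming a fully connected ReLU network with Gaussian weights of variance 2/fan_in and standard normal inputs; all of these choices, including the function name and default sizes, are illustrative and not taken from the paper. It records the empirical mean and variance of the activations after each layer. The nonzero mean is the quantity that mean subtraction removes.

```python
import numpy as np

def activation_stats(width=1000, depth=5, n_samples=200,
                     input_dim=100, seed=0):
    """Per-layer (mean, variance) of activations in a random ReLU network."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((n_samples, input_dim))  # random inputs
    stats = []
    for _ in range(depth):
        fan_in = h.shape[1]
        # Assumed initialization: zero-mean Gaussian with variance 2 / fan_in.
        w = rng.standard_normal((fan_in, width)) * np.sqrt(2.0 / fan_in)
        h = np.maximum(h @ w, 0.0)
        # Statistics are pooled over all samples and all units in the layer.
        stats.append((float(h.mean()), float(h.var())))
    return stats

for layer, (mean, var) in enumerate(activation_stats(), start=1):
    print(f"layer {layer}: mean={mean:.3f}, variance={var:.3f}")
```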

alleviating pathological sharpness, normalization method, wide neural network, (2 more...)


Reviews: The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Neural Information Processing Systems, Jan-26-2025, 02:25:59 GMT

The paper is well written and analyzes how signals propagate in random neural networks. It does so by analyzing the mean and variance of activations and gradients, given random inputs and weights. The technical contributions are adequate, and the analysis leads to new insights on the use of batch/layer normalization.

alleviating pathological sharpness, normalization method, wide neural network


The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Neural Information Processing Systems, Oct-10-2024, 15:00:50 GMT

Normalization methods play an important role in enhancing the performance of deep learning while their theoretical understandings have been limited. To theoretically elucidate the effectiveness of normalization, we quantify the geometry of the parameter space determined by the Fisher information matrix (FIM), which also corresponds to the local shape of the loss landscape under certain conditions. We analyze deep neural networks with random initialization, which is known to suffer from a pathologically sharp shape of the landscape when the network becomes sufficiently wide. We reveal that batch normalization in the last layer contributes to drastically decreasing such pathological sharpness if the width and sample number satisfy a specific condition. In contrast, it is hard for batch normalization in the middle hidden layers to alleviate pathological sharpness in many settings.

alleviating pathological sharpness, normalization, normalization method, (3 more...)


The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Karakida, Ryo, Akaho, Shotaro, Amari, Shun-ichi

Neural Information Processing Systems, Mar-18-2020, 23:03:16 GMT

Normalization methods play an important role in enhancing the performance of deep learning while their theoretical understandings have been limited. To theoretically elucidate the effectiveness of normalization, we quantify the geometry of the parameter space determined by the Fisher information matrix (FIM), which also corresponds to the local shape of the loss landscape under certain conditions. We analyze deep neural networks with random initialization, which is known to suffer from a pathologically sharp shape of the landscape when the network becomes sufficiently wide. We reveal that batch normalization in the last layer contributes to drastically decreasing such pathological sharpness if the width and sample number satisfy a specific condition. In contrast, it is hard for batch normalization in the middle hidden layers to alleviate pathological sharpness in many settings.

alleviating pathological sharpness, normalization, normalization method, (3 more...)

